Two objectives guided the present study: (1) to provide a suitable test for the hypothesis rho = 0, and (2) to establish a means by which general users of R can set confidence intervals on rho. The first objective was approached by testing several possible solutions similar to the procedure followed by Forsyth. The second objective was pursued via a combination of a general analytical procedure (Mood and Graybill, 1963) together with computer simulation techniques and a curve fitting technique (Usow, 1970). Procedures for achieving both objectives required the use of R distributions. The method used to obtain the necessary R distributions and the two procedures and their results are described.
(Author/DB)

It is well known that the magnitude of a correlation coefficient is affected substantially by group variability. To adjust for group variability, Pearson (1903) derived a formula for R, a correlation coefficient corrected for restriction of range. The correction preceded the onset of inferential techniques and was developed to estimate ρ for an unrestricted population in situations where complete selection and criterion data were available only from a selected extremity of the population. Essentially, it was a descriptive statistic, provided the assumptions of a bivariate normal population, homoscedasticity of variance errors, and linearity of regression were met. However, R is most often used to estimate ρ when both selection and criterion data are available only from a sample of a selected extremity of the population. Consequently, although R was developed for descriptive purposes, its current use is primarily as an inferential statistic.

As a descriptive statistic, R will not equal ρ only if the underlying assumptions have not been met. Two empirical studies (Hovls, 1935; CreaBe, 1953) illustrated that in several cases involving large N's the assumptions are met well enough so that R describes ρ very well. The same situation does not hold for inferential statistics. Due to the presence of sampling error, one does not expect a statistic to equal the parameter even in cases where the underlying assumptions have been exactly met. For this reason, R is a meaningful inferential statistic only to the extent that an investigator can specify boundaries within which ρ will lie. Presently, investigators cannot determine the accuracy of the corrected correlation coefficient, because its inferential characteristics have not been assessed. As a result, Lord and Novick (1968, p. 147) stated, "...a more cautious attitude toward these formulas is called for in any applications in which the ratio of standard deviations in the unselected group to standard deviations in the selected group is more than 1.40. This condition corresponds to a selection of approximately the upper 70 percent from a standard normal population."

It has been difficult to explicate the properties of R primarily because R is dependent upon three parameters: sample size (N), the correlation coefficient between the two variables X and Y in the unrestricted population (ρ), and the percentile point of the X variable such that all X values included in the explicitly selected sample are larger than A [P(A)]. The interdependency among these factors causes intractable mathematical problems that have thus far precluded an analytical solution to the density function of R.

Only one study (Forsyth, 1971) has been undertaken to clarify the inferential properties of R. That study used computer simulation methods simply to test the efficacy of a hypothesized "solution": Fisher's log transformation of R to set confidence intervals on ρ. It was established that the procedure does not produce suitably accurate confidence intervals. An attempt to "correct" the formula by adjusting the degrees of freedom in the Z statistic improved the results, but did not provide a definitive solution to the interval estimation problem.
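For reference, the Fisher log transformation procedure mentioned above can be sketched as follows. This is only the textbook interval for an ordinary Pearson r, applied unchanged to R; the function name `fisher_interval` is ours, and the adjusted-degrees-of-freedom variant is not reproduced because its details are not given here.

```python
import math
from statistics import NormalDist


def fisher_interval(R, N, alpha=0.05):
    """Textbook Fisher z interval, treating R as if it were an ordinary r."""
    z = math.atanh(R)                            # Fisher log transformation of R
    se = 1.0 / math.sqrt(N - 3)                  # usual standard error of z
    crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed normal critical value
    lower = math.tanh(z - crit * se)             # back-transform to the correlation scale
    upper = math.tanh(z + crit * se)
    return lower, upper


lo, hi = fisher_interval(0.5, 25)
print(round(lo, 2), round(hi, 2))
```

For R = .5 and N = 25 this gives roughly .13 to .75. The interval takes no account of P(A), which is one way to see why it may be unsuitable for a corrected coefficient.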

Two objectives guided the present study: (1) to provide a suitable test for the hypothesis ρ = 0, and (2) to establish a means by which general users of R can set confidence intervals on ρ. The first objective was approached by testing several possible solutions similar to the procedure followed by Forsyth. The second objective was pursued via a combination of a general analytical procedure (Mood & Graybill, 1963) together with computer simulation techniques and a curve fitting technique (Usow, 1970). Procedures for achieving both objectives required the use of R distributions. The method used to obtain the necessary R distributions is described immediately below. Following that, the two procedures and their results are described.

There are two alternative formulas that can be used to obtain R values, and hence estimate ρ. One is in common use; a description of it can be found in Gulliksen (1950) and several other sources. A second formula described by Kelley (1923) yields approximately the same value of R as does the conventional formula but is more difficult to use and generally resulted in less acceptable tests of ρ = 0 (Gullickson, 1971). For these reasons, only the procedures and results as they pertain to the conventional formula are described in this paper. (Lower case letters denote values from the restricted group; capital letters indicate values for the unrestricted group.)
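The conventional formula itself does not survive in this copy. As a hedged reconstruction, the correction for explicit selection on X is usually stated in the psychometric literature as R = r(S_x/s_x) / sqrt(1 - r² + r²(S_x/s_x)²), where r and s_x come from the restricted group and S_x from the unrestricted group. The sketch below assumes that standard form; the function name is ours.

```python
import math


def correct_for_restriction(r, s_x, S_x):
    """Correct a restricted-group correlation r for explicit selection on X.

    r   : correlation in the restricted (selected) group
    s_x : standard deviation of X in the restricted group
    S_x : standard deviation of X in the unrestricted group
    Assumes the standard textbook form of the correction.
    """
    k = S_x / s_x                      # ratio of unrestricted to restricted SD
    return (r * k) / math.sqrt(1.0 - r * r + r * r * k * k)


# With no restriction (S_x == s_x) the correction leaves r unchanged.
print(correct_for_restriction(0.30, 1.0, 1.0))
# With the unrestricted SD twice the restricted SD, r = .30 is raised.
print(round(correct_for_restriction(0.30, 0.5, 1.0), 3))
```

The correction grows with the ratio S_x/s_x, which is why the degree of selection enters every result that follows.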



Forsyth (1971) found that R distributions obtained by using σ_x in the above formula differed little from R distributions obtained by using S_x; hence, σ_x was used in place of S_x for the computation of all R values since it simplified the procedure and reduced computer costs.

Values of r and s_x were obtained using a set of normal deviates N(0,1) (Collins, 1970) together with a random number generator (Jordan, 1970) and a computer simulation method for obtaining correlated pairs from a population of paired variates having a correlation of ρ (Lehman & Bailey, 1968, p. 220). In this process, N variates of the (X,Y) pairs were always randomly selected from the population of X values greater than P(A). Each r value thus obtained was then corrected for explicit selection via the conventional formula to obtain R. The procedure was replicated, holding N, ρ, and P(A) constant, a designated number of times to produce a distribution of R sample point estimates of ρ.
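The simulation just described can be sketched roughly as follows. This is a minimal illustration, not the original program: it uses Python's built-in generator rather than the cited normal-deviate set and random number generator, it forms correlated pairs as Y = ρX + sqrt(1 - ρ²)E (a standard construction that we assume is equivalent in effect to the cited method), and it takes the unrestricted standard deviation of X as 1, the population value. All function names are ours.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def corrected_r(r, s_x, S_x=1.0):
    """Standard correction for explicit selection on X (assumed form)."""
    k = S_x / s_x
    return (r * k) / math.sqrt(1.0 - r * r + r * r * k * k)


def pearson_r(xs, ys):
    """Ordinary product-moment correlation."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_R(rho, N, p_a, reps, seed=0):
    """Return `reps` corrected coefficients R for fixed rho, N, and P(A)."""
    rng = random.Random(seed)
    cutoff = NormalDist().inv_cdf(p_a)     # X value at the P(A) percentile point
    out = []
    for _ in range(reps):
        xs, ys = [], []
        while len(xs) < N:                 # keep only pairs with X above the cutoff
            x = rng.gauss(0.0, 1.0)
            if x <= cutoff:
                continue
            e = rng.gauss(0.0, 1.0)
            xs.append(x)
            ys.append(rho * x + math.sqrt(1.0 - rho * rho) * e)
        r = pearson_r(xs, ys)              # restricted-group correlation
        out.append(corrected_r(r, stdev(xs)))
    return out


dist = simulate_R(rho=0.5, N=27, p_a=0.50, reps=200, seed=1)
print(round(mean(dist), 2), round(stdev(dist), 2))
```

Holding the seed fixed makes a run repeatable; the replication counts in Tables 1 and 3 would replace the small `reps` used here.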

Testing the Hypothesis ρ = 0

As indicated in Table 1, R distributions, each composed of 1,000 sample points, were formed for ρ = 0, N = 27, 52, 100, and for P(A) = .10, .50, .75.

Insert Table 1 about here

TABLE 1

Number of Replications Used for Building R Distributions of Estimates of ρ When ρ = 0

N       Number of R Sample Points per Distribution      P(A)
27      1,000                                           .10, .50, .75
52      1,000                                           .10, .50, .75
100     1,000                                           .10, .50, .75

Five test statistics were applied to the R sample points in each of the nine distributions in an attempt to find the adequacy of the test statistics in terms of actual significance levels being equal to respective nominal significance levels. The five test statistics are listed below.

(1) z = R/σ, where σ = 1/[denominator illegible in source]

(2) z = R/σ, where σ is the standard error of R given by Kelley (1923, p. 316)

(3) t = R/S, where S is a function of 1 - R² [remainder illegible in source]

(4) t = R/S, where S is computed from r, R, and N using the Kelley (1923, p. 316) standard error

(5) z = Z/σ, where Z is the Fisher log transformation of R and σ = 1/√(N - 3)



Of the five formulas, 1, 3, and 5 were completely unacceptable: the actual Type I error probabilities substantially exceeded the nominal significance levels. Formulas 2 and 4, however, proved to be more accurate. Although formula 2 yields a z statistic and formula 4 yields a t statistic, both formulas utilize the Kelley formula to obtain the standard error of R, and the results obtained from the two procedures were quite comparable. Both exhibited a similar trend and in no case did one appear to be significantly better than the other. The average difference in Type I error probability between the results of the two formulas was only .0014. Because the two formulas are so similar, only the results for formula 4 are included here (see Table 2). Results for formulas 2, 3, and 5 are given by Gullickson (1971).



Insert Table 2 about here

TABLE 2

Actual Probability of a Type I Error for Testing the Hypothesis ρ = 0 with a Two-Tailed t-Test (Formula 4) for Various Sample Sizes and Explicit Selection Points

                    Nominal Probability of Type I Error
P(A)      N         .01       .05       .10       .20
                    Actual Probabilities of Type I Error
.10       27        .026      .063      .115      .207
.10       52        .014      .054      .099      .171
.10       100       .008      .052      .098      .206
.50       27        .048      .099      .155      .219
.50       52        .021      .069      .112      .181
.50       100       .016      .059      .107      .207
.75       27        .069      .135      .180      .246
.75       52        .032      .086      .130      .198
.75       100       .022      .069      .119      .223

Note: 1 - P(A) is the proportion of the unrestricted sample employed.



In general, formula 4 provides a liberal test at the nominal α; i.e., the actual chance of a Type I error is greater than the stated nominal level. The discrepancy between the actual and nominal significance levels becomes less pronounced as any one or combination of the following occur: (1) α is increased in magnitude, (2) N is increased in size, (3) P(A) is reduced. For example, when N = 27 and α = .01 the estimated actual significance level decreased from .069 to .026 as P(A) was reduced from .75 to .10. For the same α but N = 100, the estimated actual significance level was .022 when P(A) = .75 but reduced to .008 when P(A) = .10. For research purposes, it is recommended that either formula 2 or 4 be employed for hypothesis testing, but that no test be made if both N and the proportion in the explicitly selected sample are small.
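Each entry in Table 2 is a proportion estimated from 1,000 sample points, so it carries sampling error of its own. The following sketch, with names of our choosing, uses the ordinary binomial standard error to express how far an observed rejection rate lies above its nominal level; it is offered as a reading aid and is not part of the original analysis.

```python
import math


def rejection_rate_se(p, reps):
    """Binomial standard error of a rejection rate estimated from `reps` samples."""
    return math.sqrt(p * (1.0 - p) / reps)


def excess_in_se_units(actual, nominal, reps=1000):
    """Distance of an observed rate above the nominal level, in standard
    errors computed under the nominal level."""
    return (actual - nominal) / rejection_rate_se(nominal, reps)


# Two Table 2 entries at the nominal .05 level (1,000 sample points each):
# (P(A), N, actual Type I error probability)
for p_a, n, actual in [(0.10, 100, 0.052), (0.75, 27, 0.135)]:
    print(p_a, n, round(excess_in_se_units(actual, 0.05), 1))
```

By this yardstick the entry for N = 100, P(A) = .10 is well within one standard error of the nominal .05, while the entry for N = 27, P(A) = .75 is about twelve standard errors above it.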

Interval Estimation on ρ

Since analytic means of setting confidence intervals are not available, four sample sizes (N = 25, 50, 100, and 200), six explicit selection points (P(A) = .10, .20, .40, .60, .75, and .90), and ten correlations (ρ = 0, .1, .2, ..., .9) were used in all possible combinations to produce a total of 240 R distributions (see Table 3).

Insert Table 3 about here

TABLE 3

Number of Replications Used in Building Each R Distribution for Interval Estimation on ρ

N       P(A)                             ρ                      Number of Replications per Distribution
25      .10, .20, .40, .60, .75, .90     0, .1, .2, ..., .9     10,000
50      .10, .20, .40, .60, .75, .90     0, .1, .2, ..., .9     6,000
100     .10, .20, .40, .60               0, .1, .2, ..., .9     1,500
100     .75, .90                         0, .1, .2, ..., .9     3,000
200     .10, .20, .40, .60               0, .1, .2, ..., .9     750
200     .75, .90                         0, .1, .2, ..., .9     1,500

Note: Each distribution was formed using a single combination of N, P(A), and ρ; e.g., using the combination N = 25, P(A) = .10, and ρ = 0, 10,000 replications were made to form an R distribution.

These 240 distributions were in turn used to build 24 confidence interval nomograms for each of two α values (α = .01 and .05). All nomograms for a set combination of α and P(A) were then placed on one R vs. ρ axis to provide a total of 12 sets of nomograms, as provided in Figures 2-13.

To build a single 1 - α confidence interval nomogram, the ten R distributions for ρ = 0, .1, .2, ..., .9 and a single combination of N and P(A) were used (e.g., the ten ρ values for N = 25 and P(A) = .10). Nine of the ten R distributions, those derived under the conditions ρ = .1, .2, .3, ..., .9, were essentially used twice. Because R is symmetrical about zero, each R value of the distributions could be multiplied by -1 to produce new distributions corresponding to ρ = -.1, -.2, -.3, ..., -.9.

The lower bound of each confidence interval nomogram was formed by determining the α/2 percentile point of each of the 19 R distributions, pairing each with the ρ value it estimated, and using those 19 number pairs with the two additional (R, ρ) pairs (-1, -1) and (1, 1) to derive a polynomial line of best fit. The line forming an upper confidence interval bound was obtained in the same manner except that the 1 - α/2 percentile points of the R distributions were used. Those two lines on an R vs. ρ axis form confidence interval bounds on ρ. Figure 1, an illustrative confidence interval nomogram, allows one to set a .95 confidence interval on ρ for an R obtained from an explicitly selected sample of N = 25 and P(A) = .1. If, under the specified conditions, a user obtained an R value of .5, the confidence interval on ρ would have lower and upper bounds of -.02 and .75, respectively.
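The construction can be sketched in code under some loudly stated assumptions. The sketch substitutes straight-line interpolation between the 21 points for the polynomial line of best fit, so it is not the fitting method used in the study. It also adopts the usual inversion convention, which is our reading: the ρ whose 1 - α/2 percentile point equals the observed R is taken as the lower limit on ρ, and the ρ whose α/2 point equals R as the upper limit. The toy distributions at the bottom are placeholders for demonstration only and are not the R distributions of Table 3.

```python
import random


def percentile(sorted_vals, q):
    """Linear-interpolation percentile of an already sorted list, 0 <= q <= 1."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def bound_points(dists, q):
    """(R, rho) pairs at percentile q for 19 rho values, plus the two end pairs.

    `dists` maps each rho in 0, .1, ..., .9 to its list of simulated R values.
    Distributions for negative rho are obtained by multiplying each R by -1.
    """
    pts = [(-1.0, -1.0), (1.0, 1.0)]
    for rho, rs in dists.items():
        pts.append((percentile(sorted(rs), q), rho))
        if rho > 0:
            pts.append((percentile(sorted(-r for r in rs), q), -rho))
    return sorted(pts)


def read_off(points, R):
    """Read rho at an observed R by interpolating between adjacent points."""
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        if r0 <= R <= r1:
            return p0 if r1 == r0 else p0 + (R - r0) * (p1 - p0) / (r1 - r0)
    raise ValueError("R outside [-1, 1]")


def interval(dists, R, alpha=0.05):
    """Confidence limits on rho for an observed R (inversion convention assumed)."""
    lower = read_off(bound_points(dists, 1 - alpha / 2), R)
    upper = read_off(bound_points(dists, alpha / 2), R)
    return lower, upper


# Placeholder distributions for demonstration only (NOT simulated R values).
rng = random.Random(0)
toy = {k / 10: [max(-1.0, min(1.0, rng.gauss(k / 10, 0.2))) for _ in range(2000)]
       for k in range(10)}
print(interval(toy, 0.5))
```

Replacing `toy` with simulated R distributions for one combination of N and P(A) would give an interpolated counterpart of a single nomogram; a polynomial fit, as used in the study, would smooth the sampling noise that interpolation passes through.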

Insert Figure 1 about here

Fig. 1. The 95% confidence intervals around R, corrected for restriction of range, on ρ for N = 25, when P(A) = .10. (Find the upper limit value above the principal diagonal and the lower limit value below it.)

A single 1 - α nomogram provides precise confidence intervals on ρ only for a set combination of N and P(A). Obviously there are an infinite number of such combinations, and no single nomogram would exactly fit more than a few cases. However, the combination of four nomograms on a single axis allows a user to interpolate and set confidence intervals for a wide range of sample sizes. Also, the series of Figures 2-7, and 8-13, allow a user to interpolate across values of P(A). By interpolating within and across figures, a user can set confidence intervals on ρ for any R regardless of the sample size or the cutoff point used for explicit selection.

As can be noted from Figures 2-13, the variance of R increases as P(A) is increased. That phenomenon appears to be the cause of an increasing amount of error in the polynomial lines of best fit (Figures 2-13) as P(A) gets large. Because the variance of the R distribution is much larger for P(A) = .90 than for P(A) = .10, the precision with which the α/2 and 1 - α/2 points of the R distribution were located was correspondingly decreased. As is noted in Table 3, a very large number of sample points per distribution was obtained for all sample sizes when P(A) = .75 and .90 in an attempt to overcome that problem. The following empirical check illustrates that the errors, though larger for P(A) = .90, do not materially reduce the precision of the respective confidence interval nomograms.

As a check on the empirically obtained confidence intervals, a simulation of 10,000 replications was run for N = 25, P(A) = .90, and ρ = .65, .75, and .85. The obtained points were placed in their respective positions for the lower line on Figure 7. The largest difference between an obtained point and the line, .03, occurred when ρ = .75 on Figure 7. Note that if that empirical checkpoint were used instead of the line, it would result in a confidence interval on ρ being longer by only approximately .03 on the lower tail and .02 on the upper tail. That corresponds to an error of 3.3 percent in the confidence interval (total error, .05, divided by total confidence interval width, 1.5). Obviously, where the polynomial lines approach vertical, any error would produce a large percentage error in terms of the total confidence interval length. However, as can be seen, the difference between the checkpoints and the line is negligible in the region of the large slope.
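The percentage quoted in the check above is a simple ratio, sketched here for concreteness. It assumes the two tail discrepancies reported in the text (.03 and .02) are summed and divided by a total interval width of 1.5; the variable names are ours.

```python
lower_tail_error = 0.03   # extra interval length on the lower tail
upper_tail_error = 0.02   # extra interval length on the upper tail
interval_width = 1.5      # total confidence interval width (assumed reading)

total_error = lower_tail_error + upper_tail_error
percent_error = 100.0 * total_error / interval_width
print(round(percent_error, 1))
```

The result is about 3.3 percent, matching the figure given in the text.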


Insert Figures 2-13 about here

Fig. 2. The 99% confidence intervals around R, corrected for restriction of range, on ρ for N = 25, 50, 100, and 200 when P(A) = .10. (Find the upper limit value above the principal diagonal and the lower limit value below it.)

Fig. 3. The 99% confidence intervals around R, corrected for restriction of range, on ρ for N = 25, 50, 100, and 200 when P(A) = .20. (Find the upper limit value above the principal diagonal and the lower limit value below it.)

Fig. 4. The 99% confidence intervals around R, corrected for restriction of range, on ρ for N = 25, 50, 100, and 200 when P(A) = .40. (Find the upper limit value above the principal diagonal and the lower limit value below it.)

Fig. 5. The 99% confidence intervals around R, corrected for restriction of range, on ρ for N = 25, 50, 100, and 200 when P(A) = .60. (Find the upper limit value above the principal diagonal and the lower limit value below it.)

Fig. 6. The 99% confidence intervals around R, corrected for restriction of range, on ρ for N = 25, 50, 100, and 200 when P(A) = .75. (Find the upper limit value above the principal diagonal and the lower limit value below it.)

Fig. 7. The 99% confidence intervals around R, corrected for restriction of range, on ρ for N = 25, 50, 100, and 200 when P(A) = .90. (Find the upper limit value above the principal diagonal and the lower limit value below it.)

Fig. 8. The 95% confidence intervals around R, corrected for restriction of range, on ρ for N = 25, 50, 100, and 200 when P(A) = .10. (Find the upper limit value above the principal diagonal and the lower limit value below it.)

Fig. 9. The 95% confidence intervals around R, corrected for restriction of range, on ρ for N = 25, 50, 100, and 200 when P(A) = .20. (Find the upper limit value above the principal diagonal and the lower limit value below it.)

Fig. 10. The 95% confidence intervals around R, corrected for restriction of range, on ρ for N = 25, 50, 100, and 200 when P(A) = .40. (Find the upper limit value above the principal diagonal and the lower limit value below it.)

Fig. 11. The 95% confidence intervals around R, corrected for restriction of range, on ρ for N = 25, 50, 100, and 200 when P(A) = .60. (Find the upper limit value above the principal diagonal and the lower limit value below it.)

Fig. 12. The 95% confidence intervals around R, corrected for restriction of range, on ρ for N = 25, 50, 100, and 200 when P(A) = .75. (Find the upper limit value above the principal diagonal and the lower limit value below it.)

Fig. 13. The 95% confidence intervals around R, corrected for restriction of range, on ρ for N = 25, 50, 100, and 200 when P(A) = .90. (Find the upper limit value above the principal diagonal and the lower limit value below it.)

As was noted at the beginning of this article, an indicator of "goodness" for an inferential statistic is the width of its resulting confidence interval on the desired parameter. The width of R's confidence interval on ρ is very dependent on both P(A) and N. The relationship, although visible in Figures 2-13, may be seen more clearly in Figure 14. To obtain Figure 14, the 95 percent confidence intervals were measured for set N and R values (Figures 8-13) and then plotted against the respective P(A) values. (Confidence interval widths at P(A) = 0 were obtained from a table of confidence intervals about r on ρ (Glass & Stanley, 1970, p. 537) because when P(A) = 0, R is a true Pearson r.)

Insert Figure 14 about here

Fig. 14. A graphic representation of the effects of P(A), N, and R on confidence interval width for 95% confidence intervals on ρ.

The relationships illustrated by Figure 14 can be summarized in four generalizations:

1. For constant P(A), as N is increased in size the confidence interval width decreases.

2. As P(A) is increased, i.e., the proportion in the explicitly selected sample decreases, the confidence interval width increases proportionately. In every case the series of points for a set combination of R and N indicated a linear relationship between confidence interval width and P(A).

3. The rate of change in confidence interval width per unit change in P(A) (the slope of the line) is dependent upon N and decreases as N is increased in size. Note the top four lines of Figure 14: R = 0 for each of the four lines, and as N increases from N = 25 for the top line to N = 200 for the fourth line, the slope gradually but noticeably decreases.

4. The rate of change in confidence interval width per unit change in P(A) is dependent on R and decreases as R increases in size. Note the bottom three lines of Figure 14. For each of the three lines with N = 200, with R = 0, R = .5, and R = .7, note that as R is increased, the slope of the line decreases.

Points 2, 3, and 4 make it clear that when there is explicit selection on one variable, a considerable price is exacted in terms of the precision with which inferential statements about ρ can be made. The greater P(A) becomes, the greater will be the corresponding loss of precision. By increasing N the loss of precision caused by increasing P(A) can be reduced; also, the loss of precision per unit change in P(A) is decreased as R increases. However, it appears that only as N becomes very large and R approaches +1 will the effects of selection be negligible.

One additional point should be noted. Because of the large confidence intervals on ρ when N is small and P(A) is large, the obtained R may have little practical value, other than to prevent over-interpretation of sample R values. For example, when N = 25 and P(A) = .90, not until |R| is greater than .84 can ρ be said to be different from zero at the 99 percent level of confidence.

Certainly, the characteristics just described represent a very real improvement in our knowledge of the inferential properties of R. For the user who desires to estimate ρ from an explicitly selected sample, the nomograms (Figures 2-13) provide an efficient and accurate means of setting those confidence intervals once R has been calculated. It is obvious the nomograms are not the ultimate elegant solution; a simple confidence interval formula would be much better. But the ease with which they can be applied makes them a viable aid to most practitioners.